Adaptive Execution: Exploration and Learning of Price Impact

Authors

  • Beomsoo Park
  • Benjamin Van Roy
Abstract

We consider a model in which a trader aims to maximize expected risk-adjusted profit while trading a single security. In our model, each price change is a linear combination of observed factors, impact resulting from the trader’s current and prior activity, and unpredictable random effects. The trader must learn coefficients of a price impact model while trading. We propose a new method for simultaneous execution and learning – the confidence-triggered regularized adaptive certainty equivalent (CTRACE) policy – and establish a poly-logarithmic finite-time expected regret bound. This bound implies that CTRACE is efficient in the sense that the (ε, δ)-convergence time is bounded by a polynomial function of 1/ε and log(1/δ) with high probability. In addition, we demonstrate via Monte Carlo simulation that CTRACE outperforms the certainty equivalent policy and a recently proposed reinforcement learning algorithm that is designed to explore efficiently in linear-quadratic control problems.
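To make the setting concrete, here is a minimal sketch of the kind of linear price-change model the abstract describes, together with a regularized least-squares estimate of its coefficients. It is not the CTRACE policy itself: the confidence trigger and the trading policy are not covered. Everything specific is an assumption made for illustration: a single scalar factor, a permanent-impact coefficient `lam` on the current trade, a coefficient `gamma` on an exponentially decayed sum of prior trades, the decay rate, the noise level, and the ridge penalty `kappa`. Trades are drawn at random rather than chosen by a policy.

```python
import numpy as np


def simulate_price_changes(trades, factors, g, lam, gamma, decay, noise_std, rng):
    """Simulate price changes under an ASSUMED linear impact model:

        dp_t = g * f_t + lam * u_t + gamma * d_t + noise_t
        d_{t+1} = decay * d_t + u_t   (decayed sum of prior trades)

    Returns the regressor matrix (one row [f_t, u_t, d_t] per period)
    and the vector of simulated price changes.
    """
    T = len(trades)
    features = np.empty((T, 3))
    dp = np.empty(T)
    d = 0.0  # hypothetical transient-impact state; starts with no prior trades
    for t in range(T):
        features[t] = (factors[t], trades[t], d)
        noise = noise_std * rng.standard_normal()
        dp[t] = g * factors[t] + lam * trades[t] + gamma * d + noise
        d = decay * d + trades[t]
    return features, dp


def ridge_estimate(features, dp, kappa):
    """Regularized least squares: argmin ||X theta - dp||^2 + kappa * ||theta||^2."""
    k = features.shape[1]
    gram = features.T @ features + kappa * np.eye(k)
    return np.linalg.solve(gram, features.T @ dp)


rng = np.random.default_rng(0)
T = 5000
factors = rng.standard_normal(T)  # observed factor (assumed i.i.d. here)
trades = rng.standard_normal(T)   # random trades stand in for a policy
true_theta = np.array([0.5, 0.2, 0.05])  # assumed (g, lam, gamma)

X, dp = simulate_price_changes(
    trades, factors, *true_theta, decay=0.8, noise_std=0.1, rng=rng
)
theta_hat = ridge_estimate(X, dp, kappa=1.0)
print("true coefficients:     ", true_theta)
print("estimated coefficients:", theta_hat)
```

In this sketch the random trades excite every regressor, so the estimate settles close to the true coefficients. A trader who only follows the currently estimated optimal policy gets no such guarantee, which is the exploration problem the abstract refers to.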

Similar resources

Adaptive Execution with Online Price Impact Learning

Buying or selling a large block of security is often followed by unfavorable movement of price which is called price impact. One reason for the impact is that the block execution causes abrupt imbalance between supply and demand and the other is that it might convey to other investors information about fundamental value of the security that will be reflected on their future investment decisions...

On the effect of low-quality node observation on learning over incremental adaptive networks

In this paper, we study the impact of low-quality node on the performance of incremental least mean square (ILMS) adaptive networks. Adaptive networks involve many nodes with adaptation and learning capabilities. Low-quality mode in the performance of a node in a practical sensor network is modeled by the observation of pure noise (its observation noise) that leads to an unreliable measurement....

Efficient Exploration in Reinforcement Learning Based on Utile Suffix Memory

Reinforcement learning addresses the question of how an autonomous agent can learn to choose optimal actions to achieve its goals. Efficient exploration is of fundamental importance for autonomous agents that learn to act. Previous approaches to exploration in reinforcement learning usually address exploration in the case when the environment is fully observable. In contrast, we study the case ...

Mean-Variance Optimal Adaptive Execution

Electronic trading of equities and other securities makes heavy use of “arrival price” algorithms, that balance the market impact cost of rapid execution against the volatility risk of slow execution. In the standard formulation, mean-variance optimal trading strategies are static: they do not modify the execution speed in response to price motions observed during trading. We show that substant...

Adaptive Arrival Price

Electronic trading of equities and other securities makes heavy use of “arrival price” algorithms, that determine optimal trade schedules by balancing the market impact cost of rapid execution against the volatility risk of slow execution. In the standard formulation, mean-variance optimal strategies are static: they do not modify the execution speed in response to price motions observed during...

Journal title:
  • Operations Research

Volume 63  Issue 

Pages  -

Publication date 2015